feat(deps): upgrade transformers to 5.x and sentence-transformers to 5.2+#341
Draft
voorhs wants to merge 7 commits into
…5.2+ (#295)

The 4.57.x mistral-regex codepath called `huggingface_hub.model_info()`
on every tokenizer load with vocab >100k (e.g. `intfloat/multilingual-e5-*`),
hammering HF's rate limit in CI and in production. transformers 5.0+
caches that probe per-process and respects `local_files_only`/`HF_HUB_OFFLINE`.

The bump is necessarily a coordinated two-package migration: ST 5.2.0 is
the first release that lifts the `transformers<5.0.0` cap.
Resolved versions: transformers 5.12.1, sentence-transformers 5.6.0.

Adjusts the v5.x surfaces that actually broke:
- ranker.py: `cross_encoder.model.classifier` →
  `cross_encoder[0].auto_model.classifier` (ST 5 restructured
  CrossEncoder into a nn.Sequential of modules).
- ranker.py: CrossEncoder.predict() renamed `activation_fct` →
  `activation_fn`.
- ranker.py: `cross_encoder.model.cpu()` → `cross_encoder.cpu()` (the
  wrapper is itself an nn.Module now, no underlying `.model` attribute).
- embedder/sentence_transformers.py: import `losses`/`training_args`
  from `sentence_transformers.sentence_transformer` (top-level path
  deprecated).
- embedder/sentence_transformers.py: `warmup_ratio=` → `warmup_steps=`
  (v5 TrainingArguments accepts a float <1.0 there as a ratio).
- test_sentence_transformers_backend.py:
  `get_sentence_embedding_dimension()` → `get_embedding_dimension()`.

Removes the `_disable_transformers_mistral_regex_patch` workaround from
tests/conftest.py; the underlying bug is fixed in v5.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
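As a rough illustration of the first ranker.py change, the new attribute path can be wrapped in a small accessor. This is only a sketch: `get_classifier_head` is a hypothetical helper name, and the stand-in objects below merely mimic the layout this commit describes (`cross_encoder[0].auto_model.classifier`); they are not sentence-transformers classes.

```python
from types import SimpleNamespace
from typing import Any


def get_classifier_head(cross_encoder: Any) -> Any:
    """Return the classification head via the ST 5 layout named in this commit.

    Hypothetical helper: it assumes the first module of the wrapper exposes
    `auto_model.classifier`, as the commit message states.
    """
    first_module: Any = cross_encoder[0]
    return first_module.auto_model.classifier


# Stand-in that imitates the attribute layout only (not a real CrossEncoder).
head = object()
fake_cross_encoder = [SimpleNamespace(auto_model=SimpleNamespace(classifier=head))]
found = get_classifier_head(fake_cross_encoder)
```

Keeping the path in one place means a later layout change in the wrapper would only need to be fixed once.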
- Bump sentence-transformers lower bound 5.2.0 → 5.4.0. The new ranker /
  embedder paths (cross_encoder[0] subscript,
  sentence_transformers.sentence_transformer subpackage,
  get_embedding_dimension) all landed in 5.4.0; the previous floor would
  have ModuleNotFoundError'd / AttributeError'd anyone resolving
  5.2.x–5.3.x.
- Constrain EmbedderFineTuningConfig.warmup_ratio to (0, 1). v5
  TrainingArguments interprets warmup_steps>=1 as a raw step count and
  <1 as a fraction, so a stray warmup_ratio=1.0 would silently produce
  one warmup step instead of full-training warmup.
- Refresh tests/test_deps.py synthetic metadata fixtures to v5 version
  strings so the resolver tests exercise the version range we ship, not
  the v4 range we just left behind.
- Trim the v4→v5 narrating comments down to the WHY of the current code;
  per-line migration history belongs in the commit log.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Reviewer flagged that `gt=0` rejects the legal `warmup_ratio=0.0` config
(disable warmup). Relax to `ge=0`; `lt=1` is kept because that's the v5
boundary where warmup_steps flips from ratio to raw step count.
Regenerate the published JSON schema so it reflects the constraint;
otherwise YAML authoring against the schema would pass schema validation
and fail at runtime.

Pushed back on the reviewer's claim that `warmup_steps=0.1` runs zero
warmup: transformers v5 typed `warmup_steps: float` and
`get_warmup_steps` branches on `>= 1`, not `> 0`, so `0.1` takes the
`math.ceil(N * 0.1)` fraction branch (training_args.py:2089 in v5.12.1).
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
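The boundary behaviour discussed above can be modelled in a few lines of plain Python. This is a sketch of the semantics as described in these commit messages, not the transformers source: `validate_warmup_ratio` and `resolve_warmup_steps` are hypothetical names, and the real config constraint and `get_warmup_steps` may differ in detail.

```python
import math


def validate_warmup_ratio(warmup_ratio: float) -> float:
    """Accept only values in [0, 1), mirroring the ge=0 / lt=1 constraint."""
    if not 0 <= warmup_ratio < 1:
        raise ValueError(f"warmup_ratio must be in [0, 1), got {warmup_ratio}")
    return warmup_ratio


def resolve_warmup_steps(warmup_steps: float, num_training_steps: int) -> int:
    """Model the branch described above: >= 1 is a raw count, < 1 a fraction."""
    if warmup_steps >= 1:
        # Treated as an absolute number of steps.
        return int(warmup_steps)
    # Treated as a fraction of the total number of training steps.
    return math.ceil(num_training_steps * warmup_steps)


# A ratio of 0.1 takes the fraction branch; 1.0 would fall into the
# raw-count branch and yield a single warmup step, which is why the
# config keeps the upper bound exclusive.
fraction_case = resolve_warmup_steps(0.1, 1000)
stray_case = resolve_warmup_steps(1.0, 1000)
disabled_case = resolve_warmup_steps(validate_warmup_ratio(0.0), 1000)
```

Under this model, rejecting `1.0` at config-validation time is what prevents the silent switch from "warm up for the whole run" to "warm up for one step".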
- _bert.py: coerce label2id/id2label keys to str. huggingface_hub 1.x
StrictDataclassFieldValidationError rejects int-keyed label2id; the
v5 AutoModelForSequenceClassification.from_pretrained pipeline now
routes through that validator, so the previous {int: int} mapping
raised on every BertScorer.fit (and cascaded into a fallback
hf_hub_download call that the test guard caught as 'unpinned').
- ranker.py: cast cross_encoder[0] to Any for auto_model.classifier
access (nn.Sequential.__getitem__ is typed Tensor | Module on v5);
add arg-type ignores on CrossEncoder.predict(list[tuple[str,str]])
calls — the v5 stub demands the much wider Sequence type but the
list-of-pairs form is the documented call shape.
- Drop type: ignore comments mypy now reports as unused
(AutoTokenizer.from_pretrained gained a typed stub in transformers
v5; max_length matches TokenizerConfig.max_length cleanly).
- conftest.py: SentenceTransformer's constructor is typed Any on v5,
so add no-any-return ignore at the fixture boundary.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
A future refactor sees `{str(i): i}` as a no-op coercion and "simplifies"
back to `{i: i}`; mypy passes, then BertScorer.fit raises
StrictDataclassFieldValidationError at runtime. Comment makes the WHY
explicit at the call site, matching the WHY-only comment policy from
14f9576.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
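A minimal sketch of the key coercion with the WHY comment at the call site might look as follows. `stringify_keys` is a hypothetical helper, not the code in `_bert.py`, and the three-class mapping is made up for illustration.

```python
from collections.abc import Mapping
from typing import Any, TypeVar

V = TypeVar("V")


def stringify_keys(mapping: Mapping[Any, V]) -> dict[str, V]:
    """Return a copy of `mapping` whose keys are all strings."""
    return {str(key): value for key, value in mapping.items()}


n_classes = 3  # illustrative value only

# WHY: per the commit above, int-keyed label2id is rejected at runtime by the
# strict validator, so the str() coercion is deliberate and must not be
# "simplified" back to {i: i}, even though type checking would still pass.
label2id = stringify_keys({i: i for i in range(n_classes)})
```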
When PEFT is installed, transformers v5 calls find_adapter_config_file
on every AutoModelForSequenceClassification.from_pretrained. The
auto_factory only propagates `_commit_hash` (used for the cache
lookup) but NOT the outer `revision` to the fall-through
hf_hub_download. On a cold cache — i.e. our CI warm-cache job, which
populates model files but no negative marker for adapter_config.json —
that probe fires `hf_hub_download(repo_id, adapter_config.json,
revision=None)` and our test guard rightly flagged it as unpinned.
Pass `adapter_kwargs={"revision": revision}` so the adapter probe
inherits the pin. The first run still writes a `.no_exist` marker, but
all subsequent runs (and CI's pinned-only contract) stay clean.
Reproduces with: rm -rf ~/.cache/huggingface/hub/models--prajjwal1--bert-tiny/.no_exist
then pytest tests/pipeline/test_inference.py::test_inference_from_config[multiclass].
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
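The fix above amounts to passing the same pin twice. The sketch below builds the keyword arguments in one place so the two values cannot drift apart; `pinned_load_kwargs` is a hypothetical helper name, and the commented call at the end is shown for orientation only and is not executed here.

```python
from typing import Any


def pinned_load_kwargs(revision: str) -> dict[str, Any]:
    """Build from_pretrained kwargs so the adapter probe inherits the pin."""
    return {
        "revision": revision,
        # Forwarded to the adapter-config probe described above; without it
        # the probe falls through with revision=None.
        "adapter_kwargs": {"revision": revision},
    }


kwargs = pinned_load_kwargs("0123abc")  # placeholder revision string

# Intended use (requires transformers; `model_name` is a placeholder):
# AutoModelForSequenceClassification.from_pretrained(model_name, **kwargs)
```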
PEFT's get_peft_model_state_dict (save_and_load.py:380-384) runs an
embedding-resize sanity check on every Trainer.save_checkpoint by
calling model.config.__class__.from_pretrained(base_model_name_or_path)
with no revision. transformers fills in revision='main' as the default,
so the call hits hf_hub_download('prajjwal1/bert-tiny',
'config.json', revision='main') — unpinned, which our CI guard
correctly flags. On a cold cache (CI), this trips on every
LoRA/PTuning trial that runs through Trainer.
Clear base_model_name_or_path on the peft_config after get_peft_model
so the vocab check short-circuits at `if model_id is not None`. Our
dumper (PeftModelDumper / HFModelDumper) saves the base model
separately and the load path passes it explicitly, so the adapter
config doesn't need to remember it.
Reproduces with:
rm -rf ~/.cache/huggingface/hub/models--prajjwal1--bert-tiny/.no_exist
pytest tests/pipeline/test_inference.py::test_inference_from_config[multiclass]
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
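The workaround described above could be sketched like this. It assumes the wrapped model exposes `peft_config` as a mapping from adapter name to config object; `clear_base_model_reference` is a hypothetical helper, and the stand-in object below only imitates that shape rather than using PEFT itself.

```python
from types import SimpleNamespace
from typing import Any


def clear_base_model_reference(peft_model: Any) -> None:
    """Drop the remembered base model id from every adapter config.

    With the id set to None, the save-time vocab check described above is
    expected to short-circuit instead of fetching an unpinned config.
    """
    for adapter_config in peft_model.peft_config.values():
        adapter_config.base_model_name_or_path = None


# Stand-in for the object returned by get_peft_model (shape only).
fake_peft_model = SimpleNamespace(
    peft_config={
        "default": SimpleNamespace(base_model_name_or_path="prajjwal1/bert-tiny"),
    }
)
clear_base_model_reference(fake_peft_model)
```

This only works because the base model is saved separately and passed explicitly on load, as the commit notes; a loader that relies on the adapter config to find its base model would break.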
Closes #295.
Why
`transformers` 4.57.x calls `huggingface_hub.model_info()` on every
tokenizer load for any model with `vocab_size > 100000` (e.g. our default
`intfloat/multilingual-e5-*` family). That probe is uncacheable, fires
through the SHA pin, and 429s under CI matrix / parallel load. Because
the tests-only conftest monkey-patch isn't in production code,
real users on the classic/zero-shot-encoder presets hit it too.

`transformers` 5.0+ caches the probe per-process and respects
`local_files_only`/`HF_HUB_OFFLINE` (HF #45444). Per `transformers`
release notes the v5 fix is intentionally not backported to 4.x;
upgrading is the only path.

Scope of the bump
This is necessarily a coordinated two-package migration.
`sentence-transformers` 3.x pins `transformers<5.0.0`, and the cap
persists through ST 5.1.x; ST 5.2.0 is the first release that lifts it to
`transformers<6.0.0`.

Resolved versions: `transformers==5.12.1`, `sentence-transformers==5.6.0`.

Changes
- `pyproject.toml`: bump both extras.
- `src/autointent/_wrappers/ranker.py`: ST 5 restructured `CrossEncoder` into a `nn.Sequential` of modules:
  - `cross_encoder.model.classifier` → `cross_encoder[0].auto_model.classifier`
  - `predict(activation_fct=...)` → `predict(activation_fn=...)`
  - `cross_encoder.model.cpu()` → `cross_encoder.cpu()` (wrapper is itself an `nn.Module`).
- `src/autointent/_wrappers/embedder/sentence_transformers.py`:
  - `losses`/`training_args` from `sentence_transformers.sentence_transformer` (top-level submodule path is deprecated in 5.x).
  - `warmup_ratio=` → `warmup_steps=` (v5 `TrainingArguments` accepts a float < 1.0 there as a fraction of total training steps).
- `tests/conftest.py`: remove `_disable_transformers_mistral_regex_patch` (the underlying bug is fixed in v5).
- `tests/embedder/test_sentence_transformers_backend.py`: `get_sentence_embedding_dimension()` → `get_embedding_dimension()`.

Test plan
Verified locally on Python 3.14:
- `pytest tests/embedder`: 83 passed
- `pytest tests/modules/scoring/{test_dnnc,test_description_cross,test_rerank_scorer}`: 7 passed (Ranker / CrossEncoder paths)
- `pytest tests/modules/test_dumper.py`: 8 passed (HF model save/load)
- `pytest --collect-only`: 611 tests collect cleanly
- `ruff check` on changed files: clean

🤖 Generated with Claude Code
pytest tests/embedder— 83 passedpytest tests/modules/scoring/{test_dnnc,test_description_cross,test_rerank_scorer}— 7 passed (Ranker / CrossEncoder paths)pytest tests/modules/test_dumper.py— 8 passed (HF model save/load)pytest --collect-only— 611 tests collect cleanlyruff checkon changed files — clean🤖 Generated with Claude Code